This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[v1.7.x] update jetson dockerfile to support CUDA 10.0 #18339

Merged 11 commits on May 26, 2020


waytrue17
Contributor

@waytrue17 waytrue17 commented May 16, 2020

Description

Essentials

  • Changes are complete (i.e. I finished coding on this PR)

@mxnet-bot

Hey @waytrue17 , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [centos-gpu, miscellaneous, edge, unix-gpu, sanity, website, unix-cpu, centos-cpu, windows-gpu, clang, windows-cpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@waytrue17
Contributor Author

@ciyongch

@ciyongch
Contributor

Hi @waytrue17, #18311 is a revert PR. Do you mean this PR is meant to fix the failed CI of #18311?

@waytrue17
Contributor Author

> Hi @waytrue17, #18311 is a revert PR. Do you mean this PR is meant to fix the failed CI of #18311?

Hi ciyong, yes, I was trying to do so, but it seems it doesn't work.

@ciyongch
Contributor

Hi @waytrue17, are you still working on this?
It seems the new PR got stuck because the file toolchains/aarch64-linux-gnu-toolchain.cmake does not exist. Is it perhaps a new file introduced to the build system?

 Step 8/17 : COPY toolchains/aarch64-linux-gnu-toolchain.cmake /usr
 COPY failed: stat /var/lib/docker/tmp/docker-builder284184594/toolchains/aarch64-linux-gnu-toolchain.cmake: no such file or directory

@leezu
Contributor

leezu commented May 18, 2020

You need to include the file ci/docker/toolchains/aarch64-linux-gnu-toolchain.cmake in your PR.
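
A pre-build check along these lines could catch a missing COPY source before Docker does. This is only a sketch: the function name missing_copy_sources is hypothetical, and it assumes the Docker build context is the ci/docker directory, as the path above suggests.

```shell
# Hypothetical helper: print every path (relative to the build context)
# that does not exist and would therefore make a Dockerfile COPY step fail.
missing_copy_sources() {
    local context_dir="$1"
    shift
    local rel_path
    for rel_path in "$@"; do
        # COPY sources are resolved relative to the build context root.
        if [ ! -e "${context_dir}/${rel_path}" ]; then
            echo "${rel_path}"
        fi
    done
}

# Example (assumed layout): check the toolchain file before building.
# missing_copy_sources ci/docker toolchains/aarch64-linux-gnu-toolchain.cmake
```

An empty result means every listed source is present in the context; any printed path still needs to be added to the PR.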

@leezu
Contributor

leezu commented May 18, 2020

@mxnet-bot run ci [sanity]

@mxnet-bot

Jenkins CI successfully triggered : [sanity]

@leezu
Contributor

leezu commented May 18, 2020

@mxnet-bot run ci [all]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu, unix-gpu, centos-gpu, sanity, unix-cpu, miscellaneous, windows-gpu, centos-cpu, edge, website, clang]

@waytrue17
Contributor Author

@mxnet-bot run ci [edge, unix-gpu, website]

@mxnet-bot

Jenkins CI successfully triggered : [website, unix-gpu, edge]

@ciyongch
Contributor

Hi @waytrue17, please re-trigger the unix-gpu and website jobs; they should pass.
For the edge job, it seems CUDAToolkit_INCLUDE_DIR and CUDA_CUDART can't be found in the current build environment. The error is: Could NOT find CUDAToolkit (missing: CUDAToolkit_INCLUDE_DIR CUDA_CUDART). Can you take a look at this?
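
One way to narrow this down might be to check whether the files behind those two variables exist in the build image at all. The sketch below assumes, as in upstream CMake's FindCUDAToolkit module, that CUDAToolkit_INCLUDE_DIR is located via cuda_runtime.h and CUDA_CUDART via a cudart library; the function name check_cuda_layout and any root path passed to it are hypothetical.

```shell
# Hypothetical diagnostic: report which CUDA files are absent under a root.
check_cuda_layout() {
    local cuda_root="$1"
    local missing=""
    # Header assumed to determine CUDAToolkit_INCLUDE_DIR.
    if [ -z "$(find "$cuda_root" -name 'cuda_runtime.h' 2>/dev/null | head -n 1)" ]; then
        missing="${missing} cuda_runtime.h"
    fi
    # Runtime library assumed to determine CUDA_CUDART.
    if [ -z "$(find "$cuda_root" -name 'libcudart*' 2>/dev/null | head -n 1)" ]; then
        missing="${missing} libcudart"
    fi
    if [ -n "$missing" ]; then
        echo "missing:${missing}"
        return 1
    fi
    echo "ok"
}
```

If both files are present but the error persists, the search paths used by the toolchain file are the more likely culprit than the image contents.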

@waytrue17
Contributor Author

@mxnet-bot run ci [unix-gpu, website]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu, website]

@waytrue17
Contributor Author

> Hi @waytrue17, please re-trigger the unix-gpu and website jobs; they should pass.
> For the edge job, it seems CUDAToolkit_INCLUDE_DIR and CUDA_CUDART can't be found in the current build environment. The error is: Could NOT find CUDAToolkit (missing: CUDAToolkit_INCLUDE_DIR CUDA_CUDART). Can you take a look at this?

CI jobs triggered. Sure, I'll work on fixing the edge job.

@waytrue17 waytrue17 requested a review from szha as a code owner May 21, 2020 18:58
@@ -32,7 +32,7 @@ function install_julia() {
# The julia version in Ubuntu repo is too old
# We download the tarball from the official link:
# https://julialang.org/downloads/
wget -qO $JLBINARY https://s3.amazonaws.com/julialang2/bin/linux/x64/0.7/julia-0.7.0-linux-x86_64.tar.gz
Contributor

The problem is not the S3 path but that you hardcode the 0.7 version instead of using $1 and $2. In any case, hopefully https://julialang-s3.julialang.org/bin/linux/x64/$1/julia-$2-linux-x86_64.tar.gz will remain stable in the future.
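
As a rough illustration of the parameterized form, the URL can be assembled from the two positional arguments. This is a sketch only: the helper name julia_tarball_url is hypothetical, and reading $1 as the release series (for example 0.7) and $2 as the full version (for example 0.7.0) is an assumption based on the URL pattern above.

```shell
# Hypothetical helper: build the Julia tarball URL from the two arguments
# the review refers to (assumed: $1 = release series, $2 = full version).
julia_tarball_url() {
    local series="$1"
    local version="$2"
    echo "https://julialang-s3.julialang.org/bin/linux/x64/${series}/julia-${version}-linux-x86_64.tar.gz"
}

# Possible use inside the install function ($JLBINARY as in the diff above):
# wget -qO "$JLBINARY" "$(julia_tarball_url "$1" "$2")"
```

Keeping the version out of the URL literal means a version bump only touches the call site.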

Contributor Author

Thanks for pointing it out. I'll change it back to the S3 path.

@leezu leezu mentioned this pull request May 22, 2020
13 tasks
@waytrue17 waytrue17 changed the title from "update dockerfile for jetson" to "[v1.7.x] update jetson dockerfile to support CUDA 10.0" May 23, 2020
@ciyongch
Contributor

@leezu @ptrendx @TaoLv @pengzhao-intel, please help review the code changes.

@ciyongch
Contributor

Hi @waytrue17, is there still any failure when the TVM build is enabled? I'm not sure this is the best practice for fixing this CI failure. Ping @leezu @ptrendx @TaoLv.

@waytrue17
Contributor Author

> Hi @waytrue17, is there still any failure when the TVM build is enabled? I'm not sure this is the best practice for fixing this CI failure. Ping @leezu @ptrendx @TaoLv.

Hi @ciyongch, I think there are still some issues with the TVM build (#17840). I saw that it is disabled on master (#18204), so I did the same for 1.7.x.

@leezu leezu merged commit 1eefe66 into apache:v1.7.x May 26, 2020
@ciyongch
Contributor

Thanks @waytrue17 for helping to fix this failure :)

ChaiBapchya pushed a commit to ChaiBapchya/mxnet that referenced this pull request Jun 14, 2020
* update dockerfile for jetson

* add toolchain files

* update build_jetson function

* update ubuntu_julia.sh

* update FindCUDAToolkit.cmake

* Update centos7_python.sh

* revert changes on ubuntu_julia.sh

* disable TVM for gpu build

* Disable TVM_OP on GPU builds

Co-authored-by: Wei Chu <weichu@amazon.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>
szha pushed a commit that referenced this pull request Jun 15, 2020
…8560)

* fix centos 7 url to unblock centos-cpu & gpu pipeline

* [v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339)

* update dockerfile for jetson

* add toolchain files

* update build_jetson function

* update ubuntu_julia.sh

* update FindCUDAToolkit.cmake

* Update centos7_python.sh

* revert changes on ubuntu_julia.sh

* disable TVM for gpu build

* Disable TVM_OP on GPU builds

Co-authored-by: Wei Chu <weichu@amazon.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>

* skip quantized conv flaky case (#16866)

* Fix quantized concat when inputs are mixed int8 and uint8

Change-Id: I4da04bf4502425134a466823fb5f73da2d7a419b

* skip flaky test

* trigger ci

Co-authored-by: waytrue17 <52505574+waytrue17@users.noreply.github.com>
Co-authored-by: Wei Chu <weichu@amazon.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>
Co-authored-by: Xinyu Chen <xinyu1.chen@intel.com>
ChaiBapchya pushed a commit to ChaiBapchya/mxnet that referenced this pull request Jun 30, 2020
sandeep-krishnamurthy pushed a commit that referenced this pull request Jul 1, 2020
* add the missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test in runtime_functions.sh

* Revert "add the missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test in runtime_functions.sh"

This reverts commit de173b0.

* Revert "[CI][1.6.x] fix centos 7 url to unblock centos-cpu & gpu pipeline (#18560)"

This reverts commit d271348.

* fix centos 7 url to unblock centos-cpu & gpu pipeline

* skip quantized conv flaky case (#16866)

* Fix quantized concat when inputs are mixed int8 and uint8

Change-Id: I4da04bf4502425134a466823fb5f73da2d7a419b

* skip flaky test

* trigger ci

* Trigger empty commit

* [v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339)

* update dockerfile for jetson

* add toolchain files

* update build_jetson function

* update ubuntu_julia.sh

* update FindCUDAToolkit.cmake

* Update centos7_python.sh

* revert changes on ubuntu_julia.sh

* disable TVM for gpu build

* Disable TVM_OP on GPU builds

Co-authored-by: Wei Chu <weichu@amazon.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>

* add setuptools to ci/docker/install/requirements

* add missing build_ubuntu_gpu_cuda101_cudnn7_mkldnn_cpp_test

* add setuptool to docker & cpp-test build syntax error

* remove erroneously added cpp tests in 1.6.x

* py3 to p2

Co-authored-by: Xinyu Chen <xinyu1.chen@intel.com>
Co-authored-by: waytrue17 <52505574+waytrue17@users.noreply.github.com>
Co-authored-by: Wei Chu <weichu@amazon.com>
Co-authored-by: Leonard Lausen <leonard@lausen.nl>
@waytrue17 waytrue17 deleted the waytrue17-v1.7.x branch October 7, 2020 00:11
4 participants